I. INTRODUCTION

Over the past decade, network theory has contributed
significantly to improving our understanding of collective
dynamics in networks with complex topologies. The simplicity
of the network representation, where the interactions and
interacting elements are mapped to edges and vertices,
respectively, stimulated its use on a number of systems,
ranging from physical and biological to social and
engineering systems [TTH2|. A large number of natural and
man-made systems have been shown to be neither entirely
regular nor entirely random, but to exhibit prominent
topological properties, such as short average path lengths
and a high level of clustering.

Recently, weighted networks, in which each edge is
assigned a weight, have been shown to allow a better
description of many natural and man-made systems
[U [H fTMIT], and particularly of functional networks
underlying various brain pathologies [TM2"5]. Functional
brain networks are usually derived from either direct or 
indirect measurements of neural activity. Network ver- 
tices are associated with sensors that are placed such 
as to sufficiently capture the dynamics of different brain 
regions. The connectedness between any pair of brain re- 
gions is assessed by evaluating some linear or non-linear 
interdependencies between their neural activities [24H27] . 
Such networks can be regarded as complete weighted net- 
works, in which all possible edges exist. 

For empirical networks, interpreting findings is not 
without challenges. Findings of some network character- 
istics may be influenced by statistical fluctuations (like 
measurement or environmental noise) and systematic er- 
rors (which might, for example, be attributed to the data 
acquisition or to the selected way to construct a network 
from the data). Moreover, existing methods of analysis
may be misapplied or misinterpreted, which may lead to
inappropriate conclusions, as pointed out in Refs. |28H30j.
Standard approaches to uncovering influencing factors 
like background measurements, repeated measurements, 
or selective manipulation of the investigated system may, 
however, not be feasible in empirical network studies. An- 
other strategy is the comparison with the expected result 
for appropriate null models. This result can either be de- 
rived analytically [5TH55] or be extracted from samples 
that are obtained by Monte Carlo simulations [14 ^l34H42] . 
In the following we refer to these samples as 'surrogate 
networks', in accordance with a similar approach, that is 
well established in time series analysis [HI |H] . 

We here propose an efficient iterative procedure to 
generate strength-preserving surrogate networks for in- 
vestigations of complete weighted networks. This paper 
is organised as follows. In Sec. II we describe our approach
to surrogate networks and introduce our procedure. We show
that it generates approximately uniformly distributed
surrogates for a sufficient number of iterations and propose
a method to determine this number. With strength-preserving
surrogates and weight-preserving surrogates we reanalyse
functional networks of the human brain and investigate the
International Trade Networks (Sec. III). We demonstrate that
surrogates can provide additional information about
network-specific characteristics and thus aid in their
interpretation. Finally, in Sec. IV we draw our conclusions.



II. METHODS 
A. Definitions and Measures 

We consider undirected, weighted networks with non-negative
edge weights and treat them as complete networks, i.e., we
consider every possible edge to exist. A network of this
type with n vertices is fully described by its symmetric
non-negative weight matrix W ∈ ℝ^{n×n}, whose entry W_ij is
the weight of the edge connecting vertices i and j. For
practical purposes we define the diagonal elements W_ii as
zero. The strength of a vertex is defined as the sum of all
adjacent weights, S_i := Σ_{j=1}^{n} W_ij. We consider the
distribution of all edge weights of a network,
W := {W_12, W_13, W_23, …, W_{n−1,n}}, and the distribution
of all vertex strengths, S := {S_1, …, S_n}.

For the weighted clustering coefficient of node i we use
the following definition [55]:

C_i := \frac{\sum_{j,k} \sqrt[3]{W_{ij} W_{jk} W_{ki}}}{(n-1)(n-2)\,\max(W)}.

This definition has the advantage that the value of the
clustering coefficient is continuous for W_ij → 0 [35]. We
also consider

K_i := \frac{\sum_{j,k} \sqrt[3]{W_{ij} W_{jk} W_{ki}}}{(n-1)(n-2)} = C_i \max(W).

For the weighted shortest path between vertices i
and j we follow Ref. [37] and consider the inverse of the
weight of an edge as the length of that edge.

As network-specific characteristics we here investigate
the averages C, K, and L of C_i, K_i, and the shortest path
lengths l_ij, respectively.
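The three characteristics defined above can be computed as in the following minimal Python sketch. It is an illustration only, not the implementation used for the results reported here. The function names, the nested-list representation of W, and the choice to average the shortest path lengths over all ordered vertex pairs are assumptions made for this example.

```python
def strengths(W):
    """Vertex strengths S_i = sum_j W_ij (the diagonal of W is assumed to be zero)."""
    return [sum(row) for row in W]


def clustering(W):
    """Return the averages (C, K) of C_i and K_i over all vertices."""
    n = len(W)
    w_max = max(W[i][j] for i in range(n) for j in range(i + 1, n))
    k_sum = 0.0
    for i in range(n):
        total = 0.0
        for j in range(n):
            for k in range(n):
                # Terms with j == k, j == i or k == i vanish because W_ii = 0.
                total += (W[i][j] * W[j][k] * W[k][i]) ** (1.0 / 3.0)
        k_sum += total / ((n - 1) * (n - 2))
    K = k_sum / n
    return K / w_max, K  # C_i = K_i / max(W) for every i


def average_path_length(W):
    """Average shortest path length L, with edge length 1 / W_ij (Floyd-Warshall)."""
    n = len(W)
    inf = float("inf")
    d = [[0.0 if i == j else (1.0 / W[i][j] if W[i][j] > 0 else inf)
          for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return sum(d[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
```

For a network in which all weights are equal, this sketch gives C = 1 regardless of the weight, whereas K equals the common weight, which illustrates the role of max(W) as a normalisation factor.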

B. Network Surrogates 

We consider the extent, to which distributions of local 
network properties (such as W or S) contribute to the 
network-specific characteristic under investigation (such 
as C, K, or L). In many situations this quantity may 
reveal important aspects of the network or of the applied 
methods: 

• If the edge weights — instead of being determined by 
the investigated system — are independently drawn 
from some distribution (e.g., due to excessive 
noise), the value of any characteristic can only be 
attributed to the weight distribution W and to co- 
incidence. 

• Edge weights defined from the data are often nor- 
malised by multiplication with a factor, that de- 
pends on a distribution of local properties (e.g., 
the average strength S). This usually changes the 
extent, to which this distribution contributes to 
network-specific characteristics. Sign and magni- 
tude of this change may help to decide, whether 
a normalisation works as intended. 

• If the weight of an edge only depends monotonically 
on some intrinsic property of its adjacent vertices 
(e.g., in fitness model networks [HISS]), the value 
of network-specific characteristics may be mainly 
attributed to the strength distribution S. 

• If one local entity (e.g., an edge weight) dramatically
exceeds the others in some local property (e.g., if the
maximum edge weight is by far larger than the other
weights), it may dominate a network-specific
characteristic. As this influence is mediated by the
distribution of this property, the network-specific
characteristic would be mainly attributed to this
distribution.

• If the value of a network-specific characteristic can
be fully attributed to the distribution of a local network
property, it should be considered whether in this case a
network approach to the data is overly complicated and
simpler properties may be regarded instead.

To decide to which extent a characteristic of a given
network (the 'original network') is determined by the
distribution of a local network property, it can be compared
to the values for surrogates of this network, which are
randomised under the constraint that this distribution is
preserved. Moreover, the null hypothesis that the network
under consideration is random under the constraint of the
distribution of the local property can be tested. Details
about null hypothesis tests based on surrogates can be found
in the literature, e.g., in Ref. [44].

We here consider surrogate methods that exactly preserve
either the strength distribution S or the weight
distribution W (preserving both would in most cases only
leave one possible surrogate network, namely, the original
network). We aim at methods that sample uniformly from the
set of all networks with a given S or W, respectively. The
corresponding null hypotheses are

H_S: The network under consideration is random under
the constraint of its strength distribution S.

H_W: The network under consideration is random under
the constraint of its weight distribution W.

Note that preserving the strength distribution is equiva- 
lent to preserving the strength sequence when regarding 
network-specific properties, since they are not affected by 
a permutation of the vertices. While the generation of 
uniformly-distributed weight-preserving surrogates can 
be achieved by a reshuffling of the weights [21 20] , our 
method to generate strength-preserving surrogates is de- 
scribed in the following. 
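The reshuffling of weights mentioned above can be sketched as follows. This is a minimal Python illustration, not the implementation used for the results reported here; the function name `weight_preserving_surrogate` and the nested-list representation of the weight matrix are assumptions made for this example.

```python
import random


def weight_preserving_surrogate(W, rng=random):
    """Return a symmetric matrix whose edge weights are a random permutation of those of W."""
    n = len(W)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    weights = [W[i][j] for i, j in pairs]
    rng.shuffle(weights)  # uniform random permutation of the m edge weights
    P = [[0.0] * n for _ in range(n)]
    for (i, j), w in zip(pairs, weights):
        P[i][j] = P[j][i] = w
    return P
```

The surrogate has exactly the same multiset of edge weights as the original network, while the vertex strengths generally change.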

C. Strength-preserving surrogate networks 

The constraint of a given strength sequence of an
undirected, weighted, and complete network with n vertices
can be expressed by a system of n linear equations with
the m := ½ n(n − 1) edge weights as variables. Given
the non-negativity of the edge weights, the set of solutions
to this set of equations represents a convex polytope
Ω ⊂ ℝ^m [50], each point of which corresponds to
a network. Thus the problem of generating
strength-preserving surrogates is equivalent to that of
picking random points from a polytope. Some exact solutions
to this problem (e.g., utilizing triangulation) have been
proposed [51], but due to their computational burden they
may be applied to networks with a very small number of
vertices only. Hit-and-Run samplers [53] are a group of
iterative Monte Carlo procedures providing samples from a
bounded region, such as a polytope. The distribution of
these samples has been shown to approximate the uniform
distribution on that region under certain requirements and
for a sufficient number of iterations [53]. We here propose
a Hit-and-Run sampler that is specialised to the problem of
generating strength-preserving surrogates. In Appendix A we
present the mathematical background of this procedure and
show that it fulfils the requirements for approximately
uniform sampling.




1 . Procedure 

We propose the following procedure for sampling from
the set Ω of all networks with a given strength sequence:

1. Acquire some network P^0 ∈ Ω and set the counter
h = 1.

2. Randomly select four pairwise distinct vertex
indices i, j, k, l ∈ {1, …, n}.

3. Pick a number ζ from the uniform distribution on the
interval

\left[ -\min\left(P^{h-1}_{ij}, P^{h-1}_{kl}\right),\; \min\left(P^{h-1}_{jk}, P^{h-1}_{li}\right) \right].

Let P^h = P^{h-1}, but set P^h_{ij} = P^{h-1}_{ij} + ζ,
P^h_{jk} = P^{h-1}_{jk} − ζ, P^h_{kl} = P^{h-1}_{kl} + ζ,
and P^h_{li} = P^{h-1}_{li} − ζ (cf. Fig. 1).

4. If h < t, raise h by 1 and continue at 2. Otherwise
let P^t be the surrogate network.

The interval to which ζ is limited is the maximum
one such that the transformed network does not contain
any negative weights. This procedure can be regarded as
an extension of previously suggested null model samplers

[Si Eg 022 si]. 
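The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results reported here: the function names are hypothetical, the network is stored as a symmetric nested list, and the starting point is simply a copy of the original network. The clamp to zero only guards against floating-point rounding at the interval boundaries and is not part of the procedure as described.

```python
import random


def tetragon_step(P, rng=random):
    """Apply one tetragon transformation to the symmetric weight matrix P in place."""
    n = len(P)
    i, j, k, l = rng.sample(range(n), 4)  # four pairwise distinct vertices
    # Largest interval for zeta that keeps all four affected weights non-negative.
    lower = -min(P[i][j], P[k][l])
    upper = min(P[j][k], P[l][i])
    zeta = rng.uniform(lower, upper)
    for a, b, sign in ((i, j, 1), (j, k, -1), (k, l, 1), (l, i, -1)):
        value = max(P[a][b] + sign * zeta, 0.0)  # rounding guard (assumption)
        P[a][b] = P[b][a] = value


def strength_preserving_surrogate(W, t, rng=random):
    """Return a surrogate obtained from W by t tetragon transformations."""
    P = [row[:] for row in W]  # P^0 is a copy of the original network
    for _ in range(t):
        tetragon_step(P, rng)
    return P
```

Each vertex of the tetragon is adjacent to one edge whose weight is raised by ζ and one whose weight is lowered by ζ, so every vertex strength is unchanged by a step.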

In principle, P^t as generated by our procedure is
statistically dependent on P^0. This dependence becomes
negligible, however, for a sufficiently high number of
transformations t_suf (to be determined in Sec. II C 2).
Most computational effort has to be spent on reducing this
statistical dependence.

Concerning the acquisition of the starting point P^0,
the most direct approach would be to select P_i^0 = O for
all i, where O is the original network and the subscript
index here indicates different surrogates to be generated.
This way, however, the reduction of statistical dependence
achieved when generating surrogate P^t_{i−1} is discarded
when generating surrogate P^t_i. To benefit more from
previously achieved reductions of dependence, we therefore
employed schemes where P_i^0 is a previously generated
surrogate for most i (e.g., P_i^0 = P^t_{i−1}). Out of
several such schemes, the one depicted in Fig. 2 required
the smallest number of total iterations to generate 4096
surrogates with negligible dependencies (according to the
test presented in Sec. II C 2). This scheme was roughly
ten times faster than the direct generation of surrogates
from the original network (P_i^0 = O).

FIG. 1. 'Tetragon transformation' of the network P^{h−1} to
P^h. In a randomly selected tetragon (i, j, k, l) a random
number ζ is added to the weights of two opposing edges
(P^{h−1}_{ij}, P^{h−1}_{kl}) and subtracted from the weights
of the others (P^{h−1}_{jk}, P^{h−1}_{li}). Other edge
weights remain unaltered. Edge weights are encoded as line
thickness.

FIG. 2. Scheme used to generate surrogates V =
{P_1, …, P_4096} from an original network O. Short arrows
represent a 'step' consisting of t tetragon transformations,
long arrows represent ten such steps, A_1, …, A_14 are
auxiliary networks. (Superscripts are omitted for better
readability.)



2. Numerical estimation of the necessary number of 
transformations 

In order to estimate which number t of transformations
is sufficient, we employed the following procedure to test
whether surrogate networks are sampled appropriately. It
estimates the likelihood that surrogates
V = {P_1, …, P_a} (a ∈ ℕ) are picked independently from
the uniform distribution.

1. Select parameters b, c ∈ ℕ with c ≪ a.

2. Generate a surrogates Q = {Q_1, …, Q_a} with a
'reference method' that is known to pick surrogates
independently from the uniform distribution. Pick
some random testing points R = {R_1, …, R_b} from
the polytope Ω, e.g., by using the reference method.

3. For all i ∈ {1, …, b}, determine ε_i > 0 such that
exactly c surrogates from Q are in the ε_i-ball around
R_i.

4. For all i ∈ {1, …, b}, let k_i be the number of
surrogates from V in the ε_i-ball around R_i.

5. Let p(k) := QL^ 1 and ^ X :=

I (gp(*o) (jj/w) (|b #(i)2 ).

The expected value of χ is 1 if P_1, …, P_a are picked
independently from the uniform distribution.
Otherwise and if a and b are sufficiently high and c
is sufficiently low, the expected value of χ is lower
than 1 (cf. App. B for details).

FIG. 3. Number of sufficient transformations t_suf per step
for the generation of 4096 surrogate networks from four toy
networks with n vertices. For comparison, the solid line
displays twice the number of edges.
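Steps 3 and 4 of the test can be sketched in Python as follows. The sketch rests on assumptions that the text does not fix: the ε-balls are taken with respect to the Euclidean distance between the vectors of edge weights, the balls are closed, ε_i is set to the c-th smallest distance from R_i to the reference surrogates, and no two reference surrogates are equally distant from R_i. The function names are hypothetical, and the statistic χ of step 5 is not computed here.

```python
import math


def distance(A, B):
    """Euclidean distance between two networks, taken over the edge weights (assumption)."""
    n = len(A)
    return math.sqrt(sum((A[i][j] - B[i][j]) ** 2
                         for i in range(n) for j in range(i + 1, n)))


def ball_counts(V, Q, R, c):
    """For each testing point R_i, return k_i, the number of surrogates from V
    inside the ball around R_i that contains exactly c reference surrogates from Q."""
    counts = []
    for r in R:
        eps = sorted(distance(r, q) for q in Q)[c - 1]  # c-th smallest distance
        counts.append(sum(1 for p in V if distance(r, p) <= eps))
    return counts
```

If the surrogates in V are sampled like the reference surrogates, each k_i should scatter around c; surrogates that remain too close to the original network would instead yield very uneven counts.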

To estimate the necessary number t of transformations
per step (cf. Fig. 2), we regarded four toy networks with
random weights for each number of vertices between 25
and 149. We raised t from 1024 successively by a factor 
of 2 ' 2 . For each t we generated several realisations V of 
4096 surrogates each and if χ > 0.96 for each V, we set
t_suf = t. As a reference method we used the same method
with t = 2^19, which we assumed to generate appropriately
sampled surrogates. To avoid the reference Q being
statistically outlying, however, we omitted it if it scored
χ < 0.98 in a test against another reference generated by
the same method. For comparison, χ = 1.01 ± 0.03 for the
reference methods in a test against themselves. In Fig. 3
we show the number of sufficient transformations t_suf for
different numbers of vertices of the toy networks. We
observe that in most cases our method generates appropriate
surrogates if t is approximately twice the number of
edges in the original network.

Generating 4096 surrogate networks with t = 2^16
transformations per step took 123 s on a PC with
829 MFLOPS (2 GHz).



III. SURROGATE ANALYSIS OF EMPIRICAL 
NETWORKS 

A. Functional brain networks 

Characterizing anatomical and functional connections
in the human brain with approaches from network theory
has recently been a rapidly evolving field [7H9]. Research
over the past years indicates that both physiological and
pathophysiological states of the brain are reflected by
topological aspects of functional brain networks. Mostly
the clustering coefficient, the average shortest path
length, or similar measures have been used to characterise
these networks. Findings achieved so far can be regarded
as important since they provide new insights into
properties of normal and pathologic functional brain
networks.

In Ref. [55] functional brain networks derived from
electroencephalographic (EEG) recordings during different
states of vigilance (eyes opened and eyes closed) of 21
epilepsy patients and of 23 healthy control subjects were
analysed using the clustering coefficient C and the
average shortest path length L. Differences in these
characteristics could be observed between epilepsy patients
and healthy control subjects as well as between states of
vigilance. We here reanalyse exemplary networks from
an epilepsy patient and a healthy control subject, and
with surrogate networks we investigate to which extent
the observed findings can be attributed to the weight
distribution W or strength distribution S.

Details of the data and of recording and analysis tech- 
niques are fully described in Ref. [22| . Briefly, EEG data 
had been recorded for 30 min with n = 29 electrodes [S3] 
placed according to the 10-10 system of the American 
Electroencephalographic Society with the right mastoid 
as physical reference (sampling rate: 254.31Hz; 16 bit 
A/D conversion; bandwidth: 0-50 Hz). Each subject had
their eyes open during one half of the recording time and
closed during the other half.

EEG signals were split into consecutive non-overlapping
segments of 4096 data points (16.1 s) each.
For each segment we extracted the phases in a
frequency-selective way using Morlet wavelets centred in
the so-called alpha band (8-13 Hz) [55] and calculated the
mean phase coherence R_ij [56] as a measure for
interdependence between signals recorded at sensors i and j
(for simplicity's sake we omit the dependence on the
segment in the following). R_ij is confined to the interval
[0, 1], where R_ij = 1 indicates fully synchronised
systems. Network vertices were identified with sensors and
edges between vertices i and j were assigned the weight
W_ij = R_ij − R̄ + 1, where R̄ is the average over all R_kl
with k ≠ l. For each of these networks, we generated
4096 weight-preserving surrogates and 4096 strength- 
preserving surrogates and calculated the clustering coeffi- 
cients C and K as well as the average shortest path length 
L for the original and the surrogate networks. Note that
for many applications, such as a test of a null hypothesis,
fewer surrogates may suffice [31].

FIG. 4. Temporal evolutions of clustering coefficients C (first row) and K (fifth row) and average shortest path length L
(third row) of functional brain networks (black solid lines) and of weight-preserving surrogates (red dotted lines) and
strength-preserving surrogates (green dashed lines) for these networks. For L we show the margins of standard deviation over 4096
weight-preserving surrogates (red dotted lines). Standard deviations of C and K over the weight-preserving surrogates were
too small to be displayed; the maximum standard deviation of C, K, and L over the 4096 strength-preserving surrogates was
0.02 each. For comparison, for the original networks we show the temporal evolutions of the inverse of the maximum weight
1/max(W) (second row) and of the standard deviation σ(W) of the edge weights (fourth row).
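The construction of the edge weights from phases can be sketched in Python as follows. The mapping W_ij = R_ij − R̄ + 1 is taken from the text; the formula used for the mean phase coherence, namely the modulus of the time-averaged complex phase difference, is the commonly used definition and is an assumption here, since the text only cites a reference for it. Function names are hypothetical, and the wavelet-based phase extraction is not shown.

```python
import cmath


def mean_phase_coherence(phi_a, phi_b):
    """Modulus of the average of exp(i (phi_a - phi_b)) over one segment (assumed definition)."""
    z = sum(cmath.exp(1j * (a - b)) for a, b in zip(phi_a, phi_b))
    return abs(z) / len(phi_a)


def coherence_to_weights(R):
    """Map a symmetric matrix of coherences R_ij to weights W_ij = R_ij - mean(R) + 1."""
    n = len(R)
    off_diagonal = [R[k][l] for k in range(n) for l in range(n) if k != l]
    r_mean = sum(off_diagonal) / len(off_diagonal)
    return [[0.0 if i == j else R[i][j] - r_mean + 1.0 for j in range(n)]
            for i in range(n)]
```

With this normalisation the average edge weight of every network is 1, so that differences between networks are carried by the shape of the weight distribution rather than by its mean.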

In Fig. 4 we show the temporal evolutions of C, K,
and L for the functional networks of the epilepsy patient
and the healthy control subject and for the corresponding
weight- and strength-preserving surrogates. For
both subjects we observed, on average, higher values of
L and lower values of C and K during the eyes-closed
condition. There were, however, no clear-cut differences
in C, K, and L between the epilepsy patient and the
control subject. L during the eyes-open condition as well
as C and K during the complete observation time were
approximately equal for the original networks and the
weight-preserving surrogates. A property of the weight
distribution W that we could identify as strongly
correlated to C was the inverse of the maximum edge weight,
1/max(W). We attribute this strong influence of max(W)
mainly to its utilisation as a normalisation factor when
calculating C, since K did not exhibit such a strong
correlation to 1/max(W). The temporal evolution of L was
similar to that of the standard deviation of the edge
weights of the original network, σ(W), while the temporal
evolution of K was opposite to that of σ(W).

Despite the mostly similar temporal evolutions of C, 
K, and L for the original and the weight-preserving sur- 
rogate networks, these characteristics always assumed 
higher values for the original networks than for any of the 
4096 surrogates. Thus we can reject the null hypotheses
H_W that the original networks are random under the
constraint of their weight distribution W.
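The decision rule used here can be illustrated with a short Python sketch. It assumes a one-sided rank-order test, as is common in surrogate analysis: if the value for the original network exceeds the values of all N surrogates, the null hypothesis is rejected at a significance level of 1/(N + 1). The function names are hypothetical, and this is not necessarily the exact test statistic used for the results reported here.

```python
def exceeds_all(original_value, surrogate_values):
    """True if the original value is strictly larger than every surrogate value."""
    return all(original_value > s for s in surrogate_values)


def one_sided_significance_level(num_surrogates):
    """Significance level of a one-sided rank-order test with the given number of surrogates."""
    return 1.0 / (num_surrogates + 1)
```

With 4096 surrogates this level is 1/4097, which is far below what many applications require and is one reason why fewer surrogates may suffice for a mere test.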

When compared to the strength-preserving surrogates,
C, K, and L always assumed clearly higher values for
the original networks, and we could not observe comparable
temporal evolutions. The null hypotheses H_S that
the original networks are random under the constraint of
their strength distribution S can be rejected as well.

Our findings indicate that the clustering coefficient C
of the functional brain networks investigated here is
predominantly determined by properties of the weight
distribution W. Similar conclusions can be drawn for the
clustering coefficient K and the average shortest path
length L; for the latter, however, for the eyes-open
condition only. In contrast, the clear differences between
original and surrogate networks seen for L during the
eyes-closed condition indicate that a considerable part of
the value of this network-specific characteristic is not
determined by the weight distribution W of the functional
brain networks. Whether these findings hold for all the
data investigated in Ref. [35] needs further investigation,
which will be published elsewhere.

B. International Trade Networks 

As a second example we investigated the clustering 
coefficients C and K as well as the average shortest 
path length L of the International Trade Networks (ITN) 
[i6l[57PT] for the years 1948 to 2000. The vertices of the 
ITNs are countries and the edge weights represent the
amount of trade between the corresponding countries.
The number of vertices n of the ITNs changes annually, 
growing from n = 73 in 1948 to n = 187 in 2000. Since 
some binary properties of the ITN of 1995 could be explained
by a fitness model [SH], it is conceivable that the
structure of a weighted ITN is also governed by
vertex-intrinsic parameters, which are reflected by the
countries' total trade activity. Since the latter
corresponds to the vertex strengths, strength-preserving
surrogates might detect such an influence. As the number of
vertices n is preserved along with the strength
distribution S and with the weight distribution W,
respectively, strength- or weight-preserving surrogates
might help to detect a possible influence of this number on
the network-specific characteristics.

To construct the networks from the data we followed 
Refs. (46j |6T] to determine the trade flow between two 
countries i and j: 

F_{ij} = \frac{1}{2} \left( E_{ij} + I_{ij} + E_{ji} + I_{ji} \right),

where E_ij and I_ij denote the export and import from
country i to country j. We determined the weights as
W_ij = F_ij / F̄, where F̄ is the average over all F_ij with
i ≠ j. In each year we omitted countries of which no
trade was recorded at all [62]. 47% of the edges of these
networks were zero-weight edges. 47% of these zero-weight
edges were in turn to be attributed to missing data.
The latter (and probably some of the other zero-weight
edges) are likely to correspond to small or negligible trade
[57]. For each year we calculated C, K, and L of the
ITNs as well as of 4096 weight-preserving surrogates and
4096 strength-preserving surrogates.
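The construction of the ITN weights can be sketched in Python as follows. The sketch assumes that export and import records are given as square nested lists E and I indexed by country, and that the trade flow carries a prefactor of 1/2; the function name is hypothetical. Since the weights are normalised by the average flow, any constant prefactor of F_ij cancels and does not affect W_ij.

```python
def trade_weights(E, I):
    """Return W_ij = F_ij / mean(F) with F_ij = (E_ij + I_ij + E_ji + I_ji) / 2."""
    n = len(E)
    F = [[0.0 if i == j else 0.5 * (E[i][j] + I[i][j] + E[j][i] + I[j][i])
          for j in range(n)] for i in range(n)]
    off_diagonal = [F[i][j] for i in range(n) for j in range(n) if i != j]
    f_mean = sum(off_diagonal) / len(off_diagonal)  # average over all i != j
    return [[F[i][j] / f_mean for j in range(n)] for i in range(n)]
```

The resulting matrix is symmetric with zero diagonal, and its average off-diagonal entry is 1 in every year, which makes networks of different years comparable despite the growth of total trade.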

In the top row of Fig. 5 we show the temporal
evolutions of K and L for the ITNs and for the
weight-preserving surrogates and strength-preserving
surrogates. For most years both characteristics of the ITNs
clearly deviated from the respective values of the
surrogates, and we thus can reject the null hypotheses H_W
and H_S that the ITNs are random under the constraint
of their weight distribution W or strength distribution S,
respectively.

We observed, however, considerable similarities in the
temporal evolutions of L for the ITNs and for the
strength-preserving surrogates, which approximately
differed by a constant factor only (note that the curves
are almost parallel in the semi-logarithmic plot). Hence
it should be considered that the temporal changes of L
can mainly be attributed to changes of S (i.e., of the
annual relative trade volumes and the number of countries),
though the absolute value of L cannot be attributed to
them. The similarities of the temporal evolutions of K
between the ITNs and the surrogates are less dominant,
but apparent for both types of surrogates. This indicates
that the temporal changes of K can only partially be
attributed to changes of W or S. In the bottom right part
of Fig. 5 we show the temporal evolution of 1/n, which
we observe to be similar to that of K. Increases of the
number n of countries, however, mostly coincide with
separations of countries, which in turn may also affect W
or S. Thus our findings do not resolve whether there is a
direct influence of n on K. The similarities of the temporal
evolutions of K and L between the original networks and
the surrogates indicate that there are only few changes in
properties not to be attributed to the strength or weight
distribution, respectively, and thus affirm that the ITNs'
structure is mainly time-invariant [601 161].

FIG. 5. Top: temporal evolutions of clustering coefficient K (right) and average shortest path length L (left) of the International
Trade Networks for the years 1948 to 2000 (black solid lines). Also shown are the margins of standard deviation for 4096
weight-preserving surrogates (red dotted lines) and 4096 strength-preserving surrogates (green dashed lines) for these networks. Bottom
left: the same for the clustering coefficient C. Bottom right: the inverse of the number of vertices, 1/n, for comparison.

In the bottom left part of Fig. 5 we show the temporal
evolutions of C for the ITNs and for the weight-preserving
surrogates and strength-preserving surrogates. We observe
strong similarities in the temporal evolutions of C for the
ITNs and the weight-preserving surrogates as well as of
1/max(W) (not shown here). These similarities and the fact
that they are less pronounced for K affirm our findings in
Sec. III A that max(W) strongly influences C due to its use
as a normalisation constant.



IV. CONCLUSIONS 

We proposed a method to efficiently generate
strength-preserving surrogates for complete weighted
networks. With strength-preserving surrogate networks and
weight-preserving surrogate networks we reanalysed
functional brain networks and investigated the
International Trade Networks. While we regarded the
clustering coefficient and the average shortest path length
as examples, surrogate networks can also be applied to
investigate other network-specific characteristics.

For functional brain networks derived from an epilepsy
patient and a healthy control subject during different
states of vigilance we observed that the clustering
coefficients C and K as well as the average shortest path
length L are strongly dominated by properties of the weight
distribution W, namely, its standard deviation and its
maximum. Thus, previously reported differences between
subjects as well as between states may be more easily
identifiable by merely analysing properties of the
distribution of interaction strengths W. Also, given the
strong dependence of the clustering coefficient C on the
maximum weight, other normalisations for C may be more
appropriate for a comparison of networks. It is even
conceivable that, if the respective maximum weight of the
networks under comparison is always held by the same edge,
a comparison of the weights of this single edge suffices to
identify differences. In such a case, a network approach to
the data is questionable, since it is an overly complicated
description of a simple aspect of the data.

For the International Trade Networks we observed that 
relative changes of the average shortest path length over 
the period 1948 to 2000 were reflected by the strength- 
preserving surrogates. Similar results could also be ob- 
tained for the clustering coefficient K, whose temporal 
evolution was also similar to that of the number of ver- 
tices. This led us to assume that the relative changes 
were reflecting alterations of the vertex strengths, which 
are proportional to the trade volumes of the respective
countries, or of the number of vertices. Further
investigations are necessary to clarify the impact of these
influences on the ITNs' characteristics.

For both sets of empirical networks we could reject the 
null hypotheses corresponding to the applied surrogates 
in most cases. This indicates that the networks are not 
only determined by their weight or strength distributions. 
Our findings demonstrate that surrogate networks pro- 
vide additional information about network-specific char- 
acteristics and thus can aid in their interpretation. 



Appendix A: Mathematical Background

The n linear equations that correspond to the constraint
of a given strength sequence of an (undirected,
weighted, and complete) network are

S_i = \sum_{j=1}^{n} W_{ij}, \quad i \in \{1, \dots, n\}. \qquad \text{(A1)}

Since there are m := ½ n(n − 1) variables (the edge
weights) in this system of linear equations, it has an
(m − n)-dimensional subspace of solutions, which we
denote by A. The set of non-negative solutions is the
polytope Ω. For simplicity's sake we do not regard cases in
which the A-volume of Ω is 0, e.g., if S_i = 0 for any i
or if the network is star-shaped (i.e., there is one vertex
to which all non-zero-weight edges are adjacent). With
these omissions Ω is an (m − n)-polytope and A is its
affine hull.

1. Hit-and-Run Samplers

The general procedure of a Hit-and-Run sampler for
picking a random point or network, respectively, from Ω
is [53]

1. Acquire some point P^0 ∈ Ω and set the counter
h = 1.

2. Pick a direction D from the uniform distribution
over a set of directions 𝒟 ⊂ ℝ^m.

3. Pick a number ζ randomly from the uniform
distribution on {ζ ∈ ℝ | P^{h−1} + ζD ∈ Ω} and set
P^h = P^{h−1} + ζD.

4. If h < t, raise h by 1 and continue at 2. Otherwise
let P^t be the random point.

P^t is approximately sampled from the uniform distribution
if t is sufficiently large and if any two points of Ω
are accessible from each other via some selected
transformations as in step 3 [53].

In our method, step 3 of the Hit-and-Run procedure
corresponds to a 'tetragon transformation' (cf. Fig. 1)
and each vector in 𝒟 corresponds to a tetragon. Such
a vector has exactly four non-zero components, each of
which has the same absolute value and corresponds to an
edge of the tetragon.



2. Accessibility of the polytope Ω by tetragon
transformations

In this section we show that any two points of Ω are
accessible from each other via tetragon transformations,
which is required for our Hit-and-Run sampler to sample
uniformly from Ω. For this purpose we first show that
there is a basis consisting only of vectors corresponding
to tetragons (App. A 2 a). From this follows that all
vectors corresponding to tetragons form a spanning set of A
and thus each two points of the relative interior of Ω are
accessible from each other. Then we show that each point
on the relative boundary of the polytope can be modified
into one in the relative interior just with tetragon
transformations (App. A 2 b) and vice versa. Note that
despite this the probability that any point on the relative
boundary is sampled is 0. The result is, however,
important if the original network is on the relative
boundary. Also, points in the relative interior near such
an inaccessible point may only be accessible with a large
number of transformations.



a. A basis of A 



Equation (A1) written as a matrix equation contains the following
$n \times m$ matrix, if the variables (i.e., the edge weights) are
ordered as described below (given here in block notation, one block
per group of columns):

$$
\left( \begin{array}{c|c|c|c} B_1 & B_2 & \cdots & B_{n-1} \end{array} \right),
\qquad
B_i = \begin{pmatrix} I_{n-i} \\ 1 \; \cdots \; 1 \\ 0 \end{pmatrix} \in \mathbb{R}^{n \times (n-i)}.
$$

Here $B_i$ consists of an identity matrix $I_{n-i}$ in rows $1$ to
$n - i$ (with reversed column order for $i = 2$), a row of ones in row
$n - i + 1$, and zeros in the remaining $i - 1$ rows.

The $i$-th row of this matrix corresponds to the (right-hand side of)
equation $s_i = \sum_{j=1}^{n} W_{ij}$ and has an entry 1 in all
columns corresponding to weights $W_{ij}$
($j \in \{1, \dots, n\} \setminus \{i\}$). Each column has exactly two
non-zero entries, namely, the column corresponding to the edge weight
$W_{ij}$ contains a 1 in the rows $i$ and $j$. The selected ordering of
the weights may be separated into $n - 1$ groups (as indicated by the
vertical lines), such that the $i$-th group contains the edges
$W_{1,n-i+1}, \dots, W_{n-i,n-i+1}$. The second group's internal order
is reversed to simplify the following conversions, which aim at
determining a basis of $A$.

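Subtracting all prior rows from the last one and then dividing the last
row by $-2$ leaves rows $1$ to $n - 1$ unchanged and yields the last row

$$
\bigl(\, \underbrace{0 \; \cdots \; 0}_{\text{group } 1} \;\big|\; \underbrace{1 \; \cdots \; 1}_{\text{groups } 2, \dots, n-1} \,\bigr).
$$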



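Subtracting the last row from the two preceding rows yields a matrix
whose rows $1$ to $n - 3$ are unchanged and whose last three rows have
the entries

$$
\begin{array}{l|c|c|c|c}
 & \text{group } 1 & \text{group } 2 & \text{group } 3 & \text{groups } 4, \dots, n-1 \\ \hline
\text{row } n-2 & 1 \text{ at } W_{n-2,n} & 0 \text{ at } W_{n-2,n-1},\ -1 \text{ elsewhere} & 0 & -1 \\
\text{row } n-1 & 1 \text{ at } W_{n-1,n} & 0 & -1 & -1 \\
\text{row } n & 0 & 1 & 1 & 1
\end{array}
$$

(entries not listed for group 1 are zero).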
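
The outcome of this row reduction can be checked numerically for a small network. The following Python sketch (assuming NumPy is available) builds the constraint matrix of Eq. (A1), confirms that every tetragon vector solves the homogeneous system, and computes the rank of the matrix and of the set of tetragon vectors. The helper names `constraint_matrix` and `tetragon_vectors`, the choice $n = 6$, and the lexicographic column order (which differs from the ordering used in the text but does not affect ranks) are assumptions made for illustration.

```python
import itertools
import numpy as np

def constraint_matrix(n):
    """n x m matrix of Eq. (A1); column (i, j) has a 1 in rows i and j."""
    edges = list(itertools.combinations(range(n), 2))
    M = np.zeros((n, len(edges)))
    for col, (i, j) in enumerate(edges):
        M[i, col] = M[j, col] = 1
    return M, edges

def tetragon_vectors(n, edges):
    """All vectors with entries +1, -1, +1, -1 along a tetragon i-j-k-l-i."""
    index = {e: c for c, e in enumerate(edges)}
    vecs = []
    for i, j, k, l in itertools.permutations(range(n), 4):
        v = np.zeros(len(edges))
        for (a, b), sign in (((i, j), 1), ((j, k), -1), ((k, l), 1), ((l, i), -1)):
            v[index[(min(a, b), max(a, b))]] = sign
        vecs.append(v)
    return np.array(vecs)

n = 6                                   # small example size (assumption)
M, edges = constraint_matrix(n)
m = len(edges)
T = tetragon_vectors(n, edges)

rank_M = np.linalg.matrix_rank(M)       # rank of the constraint matrix
rank_T = np.linalg.matrix_rank(T)       # dimension spanned by tetragon vectors
residual = np.abs(M @ T.T).max()        # 0 if all tetragon vectors preserve strengths
```

For this example, `rank_M` equals $n$ and `rank_T` equals $m - n$, in line with the statement that tetragon vectors span the solution space of the homogeneous system.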



Thus the column vectors of the following matrix are a basis of $A$
(written column by column, with $e_{ij}$ denoting the unit vector of
the component $W_{ij}$):

$$
\begin{aligned}
& e_{k,n-1} - e_{k,n} + e_{n-2,n} - e_{n-2,n-1}, && k \in \{1, \dots, n-3\} && \text{(first group)}, \\
& e_{k,n-2} - e_{k,n} + e_{n-1,n} - e_{n-2,n-1}, && k \in \{1, \dots, n-3\} && \text{(second group)}, \\
& e_{k,j} - e_{k,n} - e_{j,n} + e_{n-2,n} + e_{n-1,n} - e_{n-2,n-1}, && 1 \le k < j \le n-3 && \text{(remaining groups)}.
\end{aligned}
$$

The basis vectors of the first two groups contain exactly four non-zero
components each. Each basis vector in the remaining groups contains
exactly six non-zero components and can be exchanged for a vector with
four non-zero components by subtracting the vector in the first group
that shares three non-zero components with it. Thus there is a basis of
$A$ only consisting of vectors with four non-zero components. Since any
of these vectors must solve

$$\forall i: \quad 0 = \sum_{j=1}^{n} W_{ij},$$

the edges corresponding to its non-zero components must form a tetragon
and thus all basis vectors correspond to a tetragon transformation.



b. Accessibility of the relative boundary of the polytope $\Omega$ by
tetragon transformations

The points on the relative boundary of $\Omega$ are exactly those which
have at least one component that is zero. Therefore, to show that any
point on the relative boundary of $\Omega$ can be transformed into a
point in the relative interior of $\Omega$ by tetragon transformations
(and vice versa), it is sufficient to show that any zero component
(i.e., zero-weight edge) can be eliminated by tetragon transformations
without creating a new one.

Let $W_{ij}$ be the zero-weight edge to be eliminated. Since
$s_i, s_j > 0$, there must be at least one non-zero-weight edge
adjacent to each of the vertices $i$ and $j$ (denoted by $W_{ik}$ and
$W_{jl}$, respectively).

I. If $k \neq l$, the tetragon transformation that raises $W_{ij}$ and
$W_{kl}$ by $\zeta = \frac{1}{2} \min(W_{ik}, W_{jl})$ and lowers
$W_{ik}$ and $W_{jl}$ by the same amount eliminates the zero-weight
edge $W_{ij}$ without creating a new one.

II. If $k$ and $l$ can only be chosen such that $k = l$, there must be
at least one non-zero-weight edge $W_{pq}$ with both $p$ and $q$ being
unequal both to $k$ (otherwise the network would be star-shaped) and to
either $i$ or $j$ (otherwise $n = 3$). In this case first $W_{ip}$ or
$W_{jp}$, respectively, and then $W_{ij}$ can be eliminated according
to I.
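
Case I of the boundary argument in App. A 2 b (a zero-weight edge $W_{ij}$ with non-zero neighbours $W_{ik}$ and $W_{jl}$, $k \neq l$) can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `eliminate_zero_edge`, the nested-list weight matrix, and the example weights are hypothetical and chosen only for illustration.

```python
def eliminate_zero_edge(W, i, j, k, l):
    """Case I: raise W[i][j] and W[k][l], lower W[i][k] and W[j][l] (in place).

    Requires four distinct vertices, W[i][j] == 0 and W[i][k], W[j][l] > 0.
    """
    assert len({i, j, k, l}) == 4
    assert W[i][j] == 0 and W[i][k] > 0 and W[j][l] > 0
    zeta = 0.5 * min(W[i][k], W[j][l])  # half the smaller weight, so neither becomes zero
    for a, b, sign in ((i, j, 1), (k, l, 1), (i, k, -1), (j, l, -1)):
        W[a][b] += sign * zeta
        W[b][a] += sign * zeta
    return zeta

# Hypothetical 4-vertex network with zero-weight edges (0,1) and (2,3).
W = [[0.0, 0.0, 2.0, 1.0],
     [0.0, 0.0, 1.0, 4.0],
     [2.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 0.0, 0.0]]
strengths_before = [sum(row) for row in W]
zeta = eliminate_zero_edge(W, 0, 1, 2, 3)
strengths_after = [sum(row) for row in W]
```

In this example both zero-weight edges of the tetragon become positive while all four strengths stay the same.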



3. Comparison of tetragon transformations to 
other Hit-and-Run Samplers 

Standard choices for the direction set $\mathcal{D}$ are the unit
sphere (Hypersphere Directions Hit-and-Run Sampler) or a basis
(Coordinate Directions Hit-and-Run Sampler) [52]. We expect our
Hit-and-Run Sampler to be faster than the Hypersphere Directions
Hit-and-Run Sampler, since the latter would require a transformation of
each direction $D$ from a basis of $A$ to a basis of $\mathbb{R}^m$.
Moreover, all $m$ components need to be taken into account when
choosing $\zeta$, while only four components need to be regarded during
each tetragon transformation. We also expect tetragon transformations
to be more efficient than a Coordinate Directions Hit-and-Run Sampler,
since they form a larger direction set $\mathcal{D}$ without increasing
the computational burden per transformation. Also, for a Coordinate
Directions Hit-and-Run Sampler the requirement of accessibility of all
points may not be fulfilled.
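
The cost argument for choosing $\zeta$ can be made concrete. For an arbitrary direction $D$, the admissible interval $\{\zeta \mid P + \zeta D \geq 0\}$ has to be determined from every non-zero component of $D$, as in the following Python sketch. The function name `step_interval` and the flat-list representation of $P$ and $D$ are assumptions for illustration.

```python
import math

def step_interval(P, D):
    """Return (lo, hi) such that P + zeta * D >= 0 componentwise for zeta in [lo, hi]."""
    lo, hi = -math.inf, math.inf
    for p, d in zip(P, D):
        if d > 0:
            lo = max(lo, -p / d)   # component grows with zeta: bounds zeta from below
        elif d < 0:
            hi = min(hi, -p / d)   # component shrinks with zeta: bounds zeta from above
    return lo, hi

# Hypothetical point with m = 6 components and a direction with four non-zero entries.
P = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
D = [1, -1, 0, 0, -1, 1]
interval = step_interval(P, D)
```

For a hypersphere direction all $m$ entries of `D` are generally non-zero, so the loop touches every edge weight, whereas a tetragon direction contributes only four terms regardless of $m$.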



4. Extension to further constraints 

For some applications it may be desirable to generate surrogate
networks that obey further constraints beyond the preservation of
strengths and non-negative weights. As long as tetragon
transformations can transform every two points of the corresponding
subset into each other, they may be used as direction set
$\mathcal{D}$ for the Hit-and-Run Sampler. Otherwise, or if in doubt,
one can still resort to a Hypersphere Directions Hit-and-Run Sampler.
In the following we provide two examples of how further constraints can
be incorporated into the Hit-and-Run sampler framework:

• The constraint that the weights of the surrogates may not exceed a
given maximum can be regarded analogously to the constraint of
non-negative edge weights. The set of possible surrogates is a
smaller $(m - n)$-polytope with $A$ as affine hull. Thus tetragon
transformations can still transform all points of the relative
interior of the new polytope into each other. App. A 2 b can be
analogously applied to the constraint of a maximum weight. Problems
may arise only in the case of zero-weight and maximum-weight edges
in the same network.

• If the binary structure of the original network is to be preserved,
zero-weight edges remain unaltered and the set of possible
surrogates is a bounding sub-polytope of $\Omega$. For sparse
networks, however, the requirement of accessibility of all points
with tetragon transformations may not be fulfilled.
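
Both constraints only change which tetragons are used and how the interval for $\zeta$ is computed. The following Python sketch shows one possible way to do this; the function names `tetragon_interval` and `allowed_tetragon` and the argument layout are assumptions for illustration and do not reproduce the original implementation.

```python
def tetragon_interval(raised, lowered, w_max):
    """Admissible zeta interval when all weights must stay within [0, w_max].

    raised:  current weights of the two edges that are raised by zeta
    lowered: current weights of the two edges that are lowered by zeta
    """
    lo = max(-min(raised), max(lowered) - w_max)
    hi = min(min(lowered), w_max - max(raised))
    return lo, hi

def allowed_tetragon(W0, i, j, k, l):
    """Binary structure preserved: use only tetragons without zero-weight
    edges in the original network W0."""
    return all(W0[a][b] != 0 for a, b in ((i, j), (j, k), (k, l), (l, i)))

# Hypothetical weights: raised edges 0.2 and 0.5, lowered edges 0.9 and 0.4.
lo, hi = tetragon_interval((0.2, 0.5), (0.9, 0.4), w_max=1.0)
```

With `w_max` set to infinity, `tetragon_interval` reduces to the interval for non-negative weights alone.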



Appendix B: Properties of the test statistic $\chi$

If points $\mathcal{Q} = Q_1, \dots, Q_a$ ($a \in \mathbb{N}$) are
picked independently from the uniform distribution on $\Omega$, the
probability $\pi$ that $c$ of them are in a given $\epsilon$-ball (or
any other subset of $\Omega$) is binomially distributed:

$$\pi = B(c, g, a) = \binom{a}{c} g^{c} (1 - g)^{a - c},$$

$g \in [0, 1]$ being the fraction of $\Omega$'s volume that is occupied
by the $\epsilon$-ball. If a priori all $g$ are equiprobable, the
probability density of a given $g$ is proportional to $\pi$.

If now $P_1, \dots, P_a$ are also picked independently from the uniform
distribution, the probability $p(k)$ that $k$ of them are in the same
$\epsilon$-ball is proportional to

$$\int_0^1 B(c, g, a) \, B(k, g, a) \, \mathrm{d}g =: \tilde{p}(k).$$

Multiple integrations by parts yield

$$\tilde{p}(k) = \frac{1}{2a + 1} \binom{a}{c} \binom{a}{k} \binom{2a}{c + k}^{-1},$$

and normalisation finally results in

$$p(k) = \frac{\hat{p}(k)}{\sum_{i=0}^{a} \hat{p}(i)} \quad \text{with} \quad \hat{p}(k) := \binom{a}{k} \binom{2a}{c + k}^{-1}.$$

For the calculation of $\chi$ several $\epsilon_i$-balls
($i \in \{1, \dots, b\}$, $b \in \mathbb{N}$) around randomly picked
points $R_1, \dots, R_b$ are regarded, each containing exactly $c$
points from $\mathcal{Q}$. The points $\mathcal{P} = P_1, \dots, P_a$
were picked independently from an unknown distribution, and $k_i$
points from $\mathcal{P}$ are in the $\epsilon_i$-ball around $R_i$. In
this case, the higher

$$\tilde{\chi} := \sum_{i=1}^{b} p(k_i),$$

the more likely it is that the points $\mathcal{P}$ are picked from the
uniform distribution on $\Omega$. Moreover, for $a, b \to \infty$ and
$c/a \to 0$ ($\Rightarrow \epsilon_i \to 0 \ \forall i$) every local
deviation from uniformity of $\mathcal{P}$'s distribution is captured
and results in a decrease of $\tilde{\chi}$. Finally $\chi$ is obtained
by normalizing $\tilde{\chi}$ by its expected value in the case that
$P_1, \dots, P_a$ are picked independently from the uniform
distribution:

$$\chi := \frac{\sum_{i=1}^{b} p(k_i)}{\sum_{i=1}^{b} \sum_{j=0}^{a} p(j)^2}.$$
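
The quantities of this appendix can be evaluated exactly with integer arithmetic. The following Python sketch computes the unnormalised weight $\frac{1}{2a+1} \binom{a}{c} \binom{a}{k} \binom{2a}{c+k}^{-1}$, its normalised version $p(k)$, and a value of $\chi$. It is a sketch under stated assumptions: the function names are hypothetical, and `chi` assumes that the unnormalised statistic is the sum of $p(k_i)$ over all balls and that its expected value under uniformity is $b \sum_{j=0}^{a} p(j)^2$.

```python
from fractions import Fraction
from math import comb

def p_tilde(k, c, a):
    """Unnormalised weight obtained from the integral over g (exact fraction)."""
    return Fraction(comb(a, c) * comb(a, k), (2 * a + 1) * comb(2 * a, c + k))

def p(k, c, a):
    """Probability that k of the a points lie in a ball containing c reference points."""
    return p_tilde(k, c, a) / sum(p_tilde(i, c, a) for i in range(a + 1))

def chi(ks, c, a):
    """Assumed form: sum of p(k_i), divided by its expected value under uniformity."""
    expected = len(ks) * sum(p(j, c, a) ** 2 for j in range(a + 1))
    return sum(p(k, c, a) for k in ks) / expected

a, c = 20, 5                       # hypothetical sample size and points per ball
total = sum(p(k, c, a) for k in range(a + 1))
norm = sum(p_tilde(k, c, a) for k in range(a + 1))
```

Using `Fraction` avoids rounding errors for large binomial coefficients; for large $a$ a floating-point or logarithmic evaluation may be preferable.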